Emergence of Addictive Behaviors in Reinforcement Learning Agents
This paper presents a novel approach to the technical analysis of wireheading
in intelligent agents. Inspired by the natural analogues of wireheading and
their prevalent manifestations, we propose modeling such phenomena in
Reinforcement Learning (RL) agents as psychological disorders. In a preliminary
step towards evaluating this proposal, we study the feasibility and dynamics of
emergent addictive policies in Q-learning agents in the tractable environment
of the game of Snake. We consider a slightly modified setting for this game,
in which the environment provides a "drug" seed alongside the original
"healthy" seed for the consumption of the snake. We adopt and extend an
RL-based model of natural addiction to Q-learning agents in this setting, and
derive sufficient parametric conditions for the emergence of addictive
behaviors in such agents. Furthermore, we evaluate our theoretical analysis
with three sets of simulation-based experiments. The results demonstrate the
feasibility of addictive wireheading in RL agents, and provide promising avenues
for further research on the psychopathological modeling of complex AI safety
problems.